Red Hat Enterprise Linux 7 Implementation

Process Management

Module Topics

Killing Processes
Monitoring Process Activity
Influencing Process Priority

Killing Processes

Using Signals to Control Processes

Signal: Software interrupt delivered to process that reports events to program
Events that generate signals:
- Errors
- External events (I/O request, expired timer, etc.)
- Explicit requests (signal-sending command, keystroke sequence)

Signal Number | Short Name | Definition | Purpose
1 | HUP | Hangup | Used to report termination of controlling process of terminal. Also used to request process reinitialization (configuration reload) without termination.
2 | INT | Keyboard interrupt | Causes program termination. Can be blocked or handled. Sent by typing INTR character (Ctrl-c).
3 | QUIT | Keyboard quit | Similar to SIGINT but produces core dump at termination. Sent by typing QUIT character (Ctrl-\).
9 | KILL | Kill, unblockable | Causes abrupt program termination. Cannot be blocked, ignored, or handled; always fatal.
15 | TERM | Terminate | Default signal sent by kill. Causes program termination. Unlike SIGKILL, can be blocked, ignored, or handled. "Polite" way to ask program to terminate; allows self-cleanup.
18 | CONT | Continue | Sent to process to resume if stopped. Cannot be blocked. Even if handled, always resumes process.
19 | STOP | Stop, unblockable | Suspends process. Cannot be blocked or handled.
20 | TSTP | Keyboard stop | Unlike SIGSTOP, can be blocked, ignored, or handled. Sent by typing SUSP character (Ctrl-z).

Standardized Signal Names and Meanings

Signal numbers vary on Linux hardware platforms
Signal names and meanings are standardized
- Recommended: For commands, use signal names instead of numbers
Each signal has a default action:
- TERM: Causes program to terminate (exit) at once
- CORE: Causes program to save memory image (core dump), then terminate
- STOP: Causes program to stop executing (suspend) and wait to continue (resume)
Can prepare programs for expected event signals
- Implement handler routines to ignore, replace, or extend signal’s default action
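
As a minimal sketch of such handler routines, the Bash example below uses the shell's trap builtin to ignore SIGHUP and to replace the default action of SIGTERM with a cleanup message. The function name signal_demo and the messages are illustrative assumptions, not part of the course material; the child shell signals itself so the result does not depend on timing.

```shell
#!/bin/bash
# signal_demo (hypothetical name): run a child bash that installs handlers
# and then signals itself, so the effect of each handler is visible.
signal_demo() {
    bash -c '
        trap "" HUP                           # ignore SIGHUP
        trap "echo cleaning up; exit 0" TERM  # replace the default action of SIGTERM
        kill -HUP $$                          # ignored, so the script keeps running
        echo "survived HUP"
        kill -TERM $$                         # handler runs and exits the script
        echo "not reached"
    '
}

signal_demo
```

SIGKILL and SIGSTOP cannot be trapped this way, which is consistent with the table of signals earlier in this module.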

Sending Signals by Explicit Request

Users signal current foreground process by typing control sequence:
- To suspend, use Ctrl-z
- To kill, use Ctrl-c
- To core dump, use Ctrl-\
Signaling background process or process in different session requires signal-sending command
- Specify signal by name or number
Users can kill own processes
Need root privileges to kill processes owned by others

kill Command

kill sends signal to process by ID

Can use kill to send any signal

Not just for terminating programs

[student@server1 ~]$ kill PID
[student@server1 ~]$ kill -signal PID
[student@server1 ~]$ kill -l
 1) SIGHUP      2) SIGINT      3) SIGQUIT     4) SIGILL      5) SIGTRAP
 6) SIGABRT     7) SIGBUS      8) SIGFPE      9) SIGKILL    10) SIGUSR1
11) SIGSEGV    12) SIGUSR2    13) SIGPIPE    14) SIGALRM    15) SIGTERM
16) SIGSTKFLT  17) SIGCHLD    18) SIGCONT    19) SIGSTOP    20) SIGTSTP
-- output truncated --
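
The following sketch shows kill used for more than termination: it suspends, resumes, and then politely terminates a disposable sleep process, using signal names rather than numbers. The function name kill_demo is a hypothetical helper; the reported status assumes a platform where SIGTERM is signal 15, as in the kill -l listing above.

```shell
#!/bin/bash
# kill_demo (hypothetical name): suspend, resume, then terminate a throwaway process.
kill_demo() {
    sleep 300 &                   # disposable background process
    local pid=$!
    kill -STOP "$pid"             # suspend (same as kill -SIGSTOP)
    kill -CONT "$pid"             # resume
    kill -TERM "$pid"             # polite termination request
    wait "$pid" 2>/dev/null       # collect the exit status of the child
    echo "exit status: $?"        # 128 + signal number when a signal ended the process
}

kill_demo
```

An exit status above 128 is how the shell reports that a process ended because of a signal rather than exiting on its own.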

killall Command

Use killall to send signal to one or more processes matching selection criteria:
- Command name
- Processes owned by specific user
- All systemwide processes

[student@server1 ~]$ killall command_pattern
[student@server1 ~]$ killall -signal command_pattern
[root@server1 ~]# killall -signal -u username command_pattern

pkill Command

pkill uses advanced selection criteria that can include these combinations:
- Command: Processes with pattern-matched command name
- UID: Processes owned by Linux user account, effective or real
- GID: Processes owned by Linux group account, effective or real
- Parent: Child processes of parent process
- Terminal: Processes running on specific controlling terminal
  [student@server1 ~]$ pkill command_pattern
  [student@server1 ~]$ pkill -signal command_pattern
  [root@server1 ~]# pkill -G GID command_pattern
  [root@server1 ~]# pkill -P PPID command_pattern
  [root@server1 ~]# pkill -t terminal_name -U UID command_pattern

Logging Users Out Administratively

To view users currently logged in and their cumulative activities, use w
- To determine user’s location, use TTY and FROM columns
All users have controlling terminal
- Graphical environment windows (pseudo-terminals) are listed as pts/N
- System console, alternate console, or other directly connected terminal devices are listed as ttyN

To display remote users' connecting system name in FROM column, use w -f

[student@server1 ~]$ w -f
 12:3:06 up 27 min,  5 users,  load average: 0.03, 0.17, 0.66
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
student  :       :0               12:20   ?xdm?   1:10   0.16s gdm-session-wor
student  pts/0    :               12:20    2.00s  0.08s  0.01s w -f
root     tty2                      12:6   14:58   0.04s  0.04s -bash
bob      tty3                      12:8   14:42   0.02s  0.02s -bash
student  pts/1    desktop2.example.12:1    1:07   0.03s  0.03s -bash
[student@server1 ~]$

Killing User Sessions

LOGIN@ column shows session login time, which indicates how long user has been on system
- JCPU column: CPU resources consumed by current jobs, including background tasks and children
- PCPU column: Current foreground process CPU consumption
Force users off system for security violations, resource overallocation, or other administrative needs
Can terminate sessions with signals
- If users cannot be contacted
- If users have unresponsive sessions, runaway resource consumption, or improper system access

Using SIGTERM and SIGKILL

SIGKILL is commonly misused instead of SIGTERM
- SIGKILL cannot be handled or ignored; it is always fatal
- SIGKILL forces termination without self-cleanup routines
- Use SIGTERM first; use SIGKILL only if process fails to respond
Can signal processes and sessions individually or collectively
- To terminate all processes for one user, use pkill
- Initial process in login session handles session termination requests and ignores unintended keyboard signals
- To terminate all user processes and login shells, use SIGKILL
  [root@server1 ~]# pgrep -l -u bob
  6964 bash
  6998 sleep
  6999 sleep
  7000 sleep
  [root@server1 ~]# pkill -SIGKILL -u bob
  [root@server1 ~]# pgrep -l -u bob
  [root@server1 ~]#
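
The "SIGTERM first, SIGKILL only if needed" advice can be sketched as a small Bash helper. The names is_alive and terminate_gracefully, the one-second polling interval, and the default timeout are assumptions made for illustration; the helper treats a zombie (state Z in ps) as already terminated.

```shell
#!/bin/bash
# is_alive and terminate_gracefully are hypothetical helper names.

# Succeeds while the PID exists and is not a zombie waiting to be reaped.
is_alive() {
    local state
    state=$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$state" ] && [ "${state#Z}" = "$state" ]
}

# Send SIGTERM, wait up to TIMEOUT seconds, then fall back to SIGKILL.
terminate_gracefully() {
    local pid=$1 timeout=${2:-5} waited=0
    kill -TERM "$pid" 2>/dev/null || return 1    # polite request first
    while is_alive "$pid"; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null        # last resort: no self-cleanup
            echo "escalated to SIGKILL"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "SIGTERM was enough"
}

# Usage: a plain sleep does not handle SIGTERM, so it exits on the first request.
sleep 300 &
terminate_gracefully "$!" 3
```

Giving the process a few seconds before escalating leaves time for its own cleanup routines to run.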

Selectively Killing User Sessions

May not need to kill all of user's processes if processes requiring attention are in same login session
To determine controlling terminal for session, use w
- Kill only processes that reference that terminal ID

Unless SIGKILL is specified, session leader handles and survives termination request

All other session processes are terminated

[root@server1 ~]# pgrep -l -u bob
7391 bash
7426 sleep
7427 sleep
7428 sleep
[root@server1 ~]# w -h -u bob
bob      tty3      18:7    5:04   0.03s  0.03s -bash
[root@server1 ~]# pkill -t tty3
[root@server1 ~]# pgrep -l -u bob
7391 bash
[root@server1 ~]# pkill -SIGKILL -t tty3
[root@server1 ~]# pgrep -l -u bob
[root@server1 ~]#

Selectively Killing Child Sessions

Can apply same selective termination process to parent-child process relationships
To view process tree for system or single user, use pstree

To kill all child processes, use parent process’s PID

Parent bash login shell survives

Signal directed only at child processes

[root@server1 ~]# pstree -p bob
bash(8391)─┬─sleep(8425)
           ├─sleep(8426)
           └─sleep(8427)
[root@server1 ~]# pkill -P 8391
[root@server1 ~]# pgrep -l -u bob
8391 bash
[root@server1 ~]# pkill -SIGKILL -P 8391
[root@server1 ~]# pgrep -l -u bob
8391 bash
[root@server1 ~]#
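
The same parent-based selection can be tried safely without another user's session. In this sketch, a throwaway bash process starts two sleep children and signals them by its own PID with pkill -P; the function name child_kill_demo is hypothetical. Because pkill never matches itself and the signal is directed only at children, the parent shell survives and prints a message.

```shell
#!/bin/bash
# child_kill_demo (hypothetical name): a throwaway parent shell signals only its children.
child_kill_demo() {
    bash -c '
        sleep 60 &
        sleep 60 &
        pkill -TERM -P $$        # select by parent PID; the parent itself is not signaled
        wait                     # reap the terminated children
        echo "parent survived"
    ' 2>/dev/null
}

child_kill_demo
```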

References

GNU C Library Reference Manual
- info libc signal, Section 24: Signal Handling
- info libc processes, Section 26: Processes
Man pages kill(1), killall(1), pgrep(1), pkill(1), pstree(1), signal(7), and w(1)

Monitoring Process Activity

Load Average

Linux kernel calculates load average metric as exponential moving average of load number
System activity data is calculated as follows:

Data | Description
Active requests | Counted from per-CPU queues for running threads and threads waiting for I/O as kernel tracks process resource activity and corresponding process state changes
Load number | Calculation routine run every 5 seconds by default that accumulates and averages active resource requests into single number for all CPUs
Exponential moving average | Mathematical formula that smooths out trending data highs and lows, increases current activity significance, and decreases significance of aging data
Load average | Result of load number calculation routine. Collectively, load average refers to three displayed values of system activity data averaged for last 1, 5, and 15 minutes

Transcript:

The Linux kernel calculates a load average metric as an exponential moving average of the load number, a cumulative CPU count of active system resource requests.

Active requests are counted from per-CPU queues for running threads and threads waiting for I/O, as the kernel tracks process resource activity and corresponding process state changes.
Load number is a calculation routine run every five seconds by default, which accumulates and averages the active requests into a single number for all CPUs.
Exponential moving average is a mathematical formula that smooths out trending data highs and lows, increases the significance of current activity, and decreases the significance of aging data.
Load average is the load number calculation routine result. Collectively, it refers to the three displayed values of system activity data averaged for the last 1, 5, and 15 minutes.
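
To make the exponential moving average concrete, the sketch below recomputes an average from a series of load numbers sampled every 5 seconds, weighting each new sample against the previous average with a decay factor of e^(-5/window). This is a floating-point illustration under those assumptions, not the kernel's own fixed-point code; the function name ema_load and its arguments are hypothetical.

```shell
#!/bin/bash
# ema_load (hypothetical name): usage: ema_load WINDOW_SECONDS SAMPLE...
# Each SAMPLE is a load number taken at a 5-second interval, starting from an idle system.
ema_load() {
    local window=$1
    shift
    printf '%s\n' "$@" | awk -v w="$window" '
        BEGIN { decay = exp(-5 / w); load = 0 }
        { load = load * decay + $1 * (1 - decay) }   # old average fades, new sample blends in
        END { printf "%.2f\n", load }'
}

# One minute (12 samples) of a constant load number of 1 on a previously idle system.
ema_load 60 1 1 1 1 1 1 1 1 1 1 1 1    # prints 0.63
```

After one full minute of constant load, the 1-minute figure has reached only about 63% of that load, which is why the displayed averages lag behind sudden changes in activity.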

Understanding the Linux Load Average Calculation

Load average represents perceived system load over time
Linux implements load average as representation of expected service wait times for CPU, disk, and network I/O
Linux counts processes and individual threads as separate tasks
- CPU request queues for running threads (nr_running) and threads waiting for I/O resources (nr_iowait) correspond to process states:
  - R (Running)
  - D (Uninterruptible Sleeping)
- Waiting-for-I/O includes tasks sleeping while waiting for disk and network responses

Load number is global counter calculation sum-totaled for all CPUs
- Tasks returning from sleep may reschedule to different CPUs
  - Accurate per-CPU counts are difficult
  - Accurate cumulative count is assured
- Displayed load averages represent all CPUs
Linux counts each physical CPU core and each microprocessor hyperthread as a separate execution unit
- Logically represented and referred to as individual CPUs
- Each CPU has independent request queues

Use /proc/cpuinfo to see kernel representation of system CPUs:

[student@server1 ~]$ grep "model name" /proc/cpuinfo
model name  : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name  : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name  : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name  : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
[student@server1 ~]$ grep "model name" /proc/cpuinfo | wc -l
4

Some UNIX systems considered only CPU utilization or run-queue length to indicate system load
Linux load average also includes tasks waiting for I/O
- Systems with idle CPUs can experience extensive waiting due to busy disk or network resources
Examine disk and network activity if load averages are high with minimal CPU activity

Interpreting Load Average

Load average values represent weighted values over last 1, 5, and 15 minutes
Calculate approximate per-CPU load value to determine if system is experiencing significant waiting

Use top, uptime, w, and gnome-system-monitor to display load average values:

[student@server1 ~]$ uptime
 15:9:03 up 14 min,  2 users,  load average: 2.92, 4.48, 5.20

Divide load average by number of logical CPUs in system

Value below 1 indicates satisfactory resource utilization and minimal wait times

Value above 1 indicates resource saturation and some service wait times

# From /proc/cpuinfo, system has four logical CPUs, so divide by 4:
#                               load average: 2.92, 4.48, 5.20
#           divide by number of logical CPUs:    4     4     4
#                                             ----  ----  ----
#                       per-CPU load average: 0.73  1.12  1.30
#
# This system's load average appears to be decreasing.
# With a load average of 2.92 on four CPUs, all CPUs were in use ~73% of the time.
# During the last 5 minutes, the system was overloaded by ~12%.
# During the last 15 minutes, the system was overloaded by ~30%.
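
The division above can be scripted. This sketch reproduces the worked example and then, on a live Linux system, applies the same calculation to the current 1-minute value from /proc/loadavg. The function name per_cpu_load is hypothetical, and using nproc for the logical CPU count is an assumption; nproc reports the processing units available to the current process, which can be fewer than the count in /proc/cpuinfo.

```shell
#!/bin/bash
# per_cpu_load (hypothetical name): divide one load average by the logical CPU count.
per_cpu_load() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Values from the uptime example above, on four logical CPUs.
for load in 2.92 4.48 5.20; do
    per_cpu_load "$load" 4
done

# On a live system, read the current values instead of hard-coding them.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null; then
    read -r one five fifteen _ < /proc/loadavg
    echo "last minute, per CPU: $(per_cpu_load "$one" "$(nproc)")"
fi
```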

Idle CPU queue has load number of 0
- Each ready or waiting thread adds count of 1
With total queue count of 1, resource (CPU, disk, or network) is in use
- No requests spend time waiting
Additional requests increment count
- Resource can process many requests within time period, so resource utilization increases
- Wait times do not increase

Processes sleeping for I/O due to busy disk or network are included in count and increase load average
- Queue count is not indication of CPU utilization
  - Indicates users and programs are waiting for resources
Until resource saturation, load average remains below 1 because tasks seldom wait in queue
Load average increases only when resource saturation causes requests to remain queued and counted by load calculation routine
When resource utilization approaches 100%, each additional request experiences service wait time

Real-time Process Monitoring

top provides dynamic view of system processes
- Displays summary header followed by process or thread list similar to ps
- Continuously refreshes, unlike static ps output
- Allows column reordering, sorting, and highlighting
User configurations can be saved and made persistent

Column | Description
PID | Process ID
USER | Username, process owner
VIRT | Virtual memory: all memory process is using, including resident set, shared libraries, and mapped or swapped memory pages. Labeled VSZ in the ps command.
RES | Resident memory: physical memory used by process, including resident shared objects. Labeled RSS in the ps command.
S | Process state, displays as: D (uninterruptible sleeping), R (running or runnable), S (sleeping), T (stopped or traced), Z (zombie)
TIME | CPU time: total processing time since process started. Can toggle to include cumulative time of all previous children.
COMMAND | Process command name

Transcript:

Default output columns are recognizable from other resource tools and the column headings are described below:

The process ID (PID).
User name (USER) is the process owner.
Virtual memory (VIRT) is all memory the process is using, including the resident set, shared libraries, and any mapped or swapped memory pages. This is labeled VSZ in the ps command.
Resident memory (RES) is the physical memory used by the process, including any resident shared objects. This is labeled RSS in the ps command.
Process state (S) displays as:
- D = Uninterruptible Sleeping
- R = Running or Runnable
- S = Sleeping
- T = Stopped or Traced
- Z = Zombie
CPU time (TIME) is the total processing time since the process started. May be toggled to include cumulative time of all previous children.
The process command name (COMMAND).

Key | Purpose
? or h | Help for interactive keystrokes
l, t, m | Toggle showing load, threads, and memory header lines
1 | Toggle showing individual CPUs or summary for all CPUs in header
s ⁽¹⁾ | Change screen refresh rate, in decimal seconds (e.g., 0.5, 1, 5)
b | Toggle reverse highlighting for running processes (default is bold only)
B | Enable use of bold in display, in header, and for running processes
H | Toggle showing process summary or individual threads
u, U | Filter for any user name (effective, real)
M | Sort processes by memory usage in descending order
P | Sort processes by processor utilization in descending order
k ⁽¹⁾ | Kill process (enter PID, then signal)
r ⁽¹⁾ | Renice a process (enter PID, then nice_value)
W | Write (save) current display configuration for use at next top restart
q | Quit

⁽¹⁾ Not available if top was started in secure mode

References

GNOME System Monitor, yelp help:gnome-system-monitor
Man pages ps(1), top(1), uptime(1), and w(1)

Influencing Process Priority

Displaying Nice Levels with top

Most process management tools (like gnome-system-monitor) display nice levels by default
To interactively view and manage processes, use top
By default top displays two columns relating to nice level:
- NI: Actual nice level
- PR: Nice level mapped to larger priority queue
  - Nice level -20 maps to priority 0
  - Nice level +19 maps to priority 39
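
The two endpoints above imply a simple offset between the NI and PR columns. As a small sketch, assuming the mapping is linear between those endpoints for processes that use nice levels, the hypothetical helper nice_to_pr adds 20 to a nice level:

```shell
#!/bin/bash
# nice_to_pr (hypothetical name): PR value corresponding to a given nice level,
# assuming the linear mapping implied by the two endpoints above.
nice_to_pr() {
    echo $(( $1 + 20 ))
}

nice_to_pr -20    # prints 0
nice_to_pr 0      # prints 20
nice_to_pr 19     # prints 39
```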

Displaying Nice Levels with ps

ps can display nice levels
- Not displayed in most default output formats
Can request required columns from ps
- Name of nice field is nice

Example: Request list of processes with PID, name, and nice level, sorted in descending order by nice level

[student@desktop1 ~]$ ps axo pid,comm,nice --sort=-nice
  PID COMMAND          NI
   74 khugepaged       19
  688 alsactl          19
 1953 tracker-miner-f  19
   73 ksmd              5
  714 rtkit-daemon      1

Some processes might report - (dash) as nice level
- These processes run with different scheduling policy
  - Often considered higher priority by scheduler
To display scheduler policy, request cls field from ps
- TS in field indicates process runs under SCHED_NORMAL and can use nice levels
- Anything else means different scheduler policy was used
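
As a short sketch, the cls field can be requested alongside the nice level so that processes reporting a dash can be matched to their scheduling class. The function name sched_class_of is a hypothetical helper, and the head limit of ten lines is only there to keep the listing short.

```shell
#!/bin/bash
# sched_class_of (hypothetical name): print the scheduling class of one PID.
sched_class_of() {
    ps -o cls= -p "$1" | tr -d ' '
}

# List nice level and scheduling class side by side, highest nice level first.
ps axo pid,comm,nice,cls --sort=-nice | head -n 10

# Class of the current shell; TS is expected under the default policy.
sched_class_of $$
```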

Launching Processes with a Different Nice Level

When process starts, it inherits nice level from parent
When process starts from command line, it inherits nice level of shell process it was started from
- Usually results in new processes running with nice level 0
To start process with different nice level, use nice
- Without options, nice <COMMAND> starts <COMMAND> with nice level 10
- To select other nice levels, use -n <NICELEVEL>
- Example: Start dogecoinminer with nice level 15 and send to background
  [student@desktop1 ~]$ nice -n 15 dogecoinminer &
Unprivileged users can set only positive nice levels (0 to 19)
Only root can set negative nice levels (-20 to -1)

Transcript:

Whenever a process is started, it will normally inherit the nice level from its parent. This means that when a process is started from the command line, it will get the same nice level as the shell process that it was started from. In most cases, this will result in new processes running with a nice level of 0.

To start a process with a different nice level, both users and system administrators can run their commands using the nice tool. Without any other options, running nice <COMMAND> starts <COMMAND> with a nice level of 10. Other nice levels can be selected with the -n <NICELEVEL> option. For example, nice -n 15 dogecoinminer & starts the command dogecoinminer with a nice level of 15 and sends it to the background immediately.

Important

Unprivileged users are only allowed to set a positive nice level (0 to 19). Only root can set a negative nice level (-20 to -1).
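
Inheritance and the effect of -n can be observed without a long-running program, because nice run without a command prints the nice level it is running with. The sketch below relies on that behavior; note that the -n value is applied as an adjustment to the current level, so the results match the requested level only when the shell itself runs at nice level 0, the usual case described above.

```shell
#!/bin/bash
# With no command, nice prints the nice level it is currently running with.
nice                        # nice level of this shell, usually 0
nice -n 15 nice             # the inner nice reports the level after adding 15
nice -n 5 bash -c 'nice'    # a child of the niced bash inherits its nice level
```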

Changing the Nice Level of a Process

To change nice level of process from command line, use renice
- Syntax for renice is renice -n <NICELEVEL> <PID>...
- Example: Change nice level of origami@home processes to -7
  [root@desktop1 ~]# renice -n -7 $(pgrep origami@home)
  You can specify more than one PID.
Regular users can only raise nice level, and only on own processes
Only root can lower nice level
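
A regular user can try renice safely on a disposable process. In this sketch the hypothetical function renice_demo starts a sleep process, raises its nice level to 10, and prints the NI value reported by ps; it assumes the shell starts at a nice level of 10 or lower.

```shell
#!/bin/bash
# renice_demo (hypothetical name): raise the nice level of a throwaway process.
renice_demo() {
    sleep 60 &
    local pid=$!
    renice -n 10 -p "$pid" >/dev/null    # raising is allowed on own processes
    ps -o ni= -p "$pid" | tr -d ' '      # show the new nice level
    kill -TERM "$pid"                    # clean up the throwaway process
    wait "$pid" 2>/dev/null || true
}

renice_demo
```

Running renice -n 0 on the same PID afterward would be refused for an unprivileged user, since that lowers the nice level.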

To change nice level interactively, use top
- Press r
- Type PID to be changed
- Type new nice level

References

Man pages: nice(1), renice(1), and top(1)

Summary

Killing Processes
Monitoring Process Activity
Influencing Process Priority
